Sentence Clustering via Projection over Term Clusters
Authors
Abstract
This paper presents a novel sentence clustering scheme based on projecting sentences over term clusters. The scheme incorporates external knowledge to overcome lexical variability and small corpus size, and outperforms common sentence clustering methods on two real-life industrial datasets.
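The abstract does not spell out the projection step, so the following is only a minimal sketch of one way the general idea could look, not the authors' method. It assumes term clusters are already available as a mapping from term to cluster id (here a hypothetical hand-written `TERM_TO_CLUSTER`), represents each sentence as a vector of counts over those clusters, and assigns the sentence to the cluster with the highest count. The function names `project` and `assign`, the whitespace tokenization, and the argmax rule are all illustrative assumptions.

```python
from collections import Counter

# Hypothetical term clusters; in practice these would come from a
# separate term-clustering step, possibly using external knowledge.
TERM_TO_CLUSTER = {
    "refund": 0, "reimbursement": 0, "money": 0,
    "login": 1, "password": 1, "account": 1,
}
N_CLUSTERS = 2


def project(sentence, term_to_cluster, n_clusters):
    """Return a vector counting how many sentence tokens fall in each term cluster."""
    counts = Counter(
        term_to_cluster[tok]
        for tok in sentence.lower().split()  # naive whitespace tokenization (assumption)
        if tok in term_to_cluster
    )
    return [counts.get(c, 0) for c in range(n_clusters)]


def assign(sentence, term_to_cluster, n_clusters):
    """Assign the sentence to the term cluster it projects onto most strongly.

    Returns None when no token of the sentence belongs to any term cluster.
    """
    vec = project(sentence, term_to_cluster, n_clusters)
    if not any(vec):
        return None
    return max(range(n_clusters), key=lambda c: vec[c])
```

Because sentences that share no surface words can still project onto the same term cluster ("refund" and "reimbursement" above), a representation of this kind is one way lexical variability might be reduced.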
Related resources
A Comparative Study of Word Co-occurrence for Term Clustering in Language Model-based Sentence Retrieval
Sentence retrieval is a very important part of question answering systems. Term clustering, in turn, is an effective approach for improving sentence retrieval performance: the more similar the terms in each cluster, the better the performance of the retrieval system. A key step in obtaining appropriate word clusters is accurate estimation of pairwise word similarities, based on their tendency t...
Effective Concept-Based Mining Model For Text Clustering
The common techniques in text mining are based on the statistical analysis of a term, either word or phrase. Statistical analysis of a term frequency captures the importance of the term within a document only. Two terms can have the same frequency in their documents, but one term contributes more to the meaning of its sentences than the other term. Usually in text mining techniques the basic me...
Enhancing Information Retrieval using Concept-Based Mining Model with Feature Extraction and Clustering
Most of the common techniques in text mining are based on the statistical analysis of a term, either word or phrase. Statistical analysis of a term frequency captures the importance of the term within a document only. However, two terms can have the same frequency in their documents, but one term contributes more to the meaning of its sentences than the other term. Thus, the underlying text min...
Clustering Sentence-Level Text Using a Fuzzy Back-Propagation Clustering Algorithm
In comparison with hard clustering methods, in which a pattern belongs to a unique cluster, clustering algorithms with fuzziness allow patterns to belong to all clusters with differing degrees of membership. This is important in domains such as sentence clustering, as a sentence may belong to more than one topic present within a document or set of documents. Since most sentence similarity measure...
Cluster-Based Language Model for Sentence Retrieval in Chinese Question Answering
Sentence retrieval plays a very important role in question answering systems. In this paper, we present a novel cluster-based language model for sentence retrieval in Chinese question answering, motivated in part by sentence clustering and language modeling. Sentence clustering is used to group sentences into clusters. A language model is used to properly represent sentences, which is combine...